Close its source exchanges once a stage finishes execution#16446
Close its source exchanges once a stage finishes execution#16446arhimondr merged 1 commit intotrinodb:masterfrom
Conversation
|
Through testing, this PR decreased CPU time and wall time by around 3.2%-3.4% for tpch-sf10000 standard suite (22 queries), concurrency = 5 on a cluster of 7 workers and 7 buffer data nodes. |
There was a problem hiding this comment.
nit: Do we need to check !stageExecution.isClosed() here?
There was a problem hiding this comment.
Yes, it's a pruning, so we don't call abort on the same stage for multiple times.
There was a problem hiding this comment.
This is what confuses me a little. You are checking the parent stage, but closing child stages. So you still may end up closing child stages several times until the parent stage itself is closed. Given the close is cheap (as you are checking if it's already closed) i wonder if it is worth extra complexity? (as it doesn't protect from close being called more than once but may create such an impression)
There was a problem hiding this comment.
I'd say closeSourceExchanges is not super cheap as it involves a loop which involves getStageId as well as a hashset lookup. Ideally we want to prune if possible.
58ab4d9 to
7c7a079
Compare
|
Failure is unrelated |
Description
For fault-tolerant execution, today we only clean up the spooling files at the end of a query. Actually we can clean up earlier --- once we are sure that the data won't be read again, then we can delete.
Additional context and related issues
N/A
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: